Hiding solutions in random satisfiability problems: A statistical mechanics approach
A major problem in evaluating stochastic local search algorithms for NP-complete problems is
the need for a systematic generation of hard test instances having previously known properties of 
the optimal solutions. On the basis of statistical mechanics results, we propose random generators 
of hard and satisfiable instances for the 3-satisfiability problem (3-SAT). The design of the hardest 
problem instances is based on the existence of a first order ferromagnetic phase transition and the 
glassy nature of excited states. The analytical predictions are corroborated by numerical results 
obtained from complete as well as stochastic local algorithms.

In natural sciences and in artificial systems, there exist
many problems whose solution requires computational
resources growing exponentially with the number of vari- 
ables N needed for their encoding. Concrete examples 
are optimization and cryptographic problems in com- 
puter science, glassy systems and random structures in 
physics and chemistry, random graphs in mathematics, 
and scheduling problems in real-world applications. 

Having fast and powerful algorithms for the resolution 
of these problems is of primary relevance for their theo- 
retical study as well as for applications. The evaluation 
of such algorithms is based on the availability of hard 
benchmarks with the following properties: problem
instances with a known solution can be generated quickly
(e.g. in time linear in N), but the resolution of an
instance takes a time exponential in N for any known
algorithm, so that the best algorithms can be easily selected.
In this Letter we propose a new generator of hard and
solvable test instances having all the properties listed
above. It is based on an NP-complete problem [1], namely
3-satisfiability (3-SAT).

The main idea for the construction of such hard and 
solvable problems is very simple: to hide a known solu- 
tion within a multitude of coexisting random meta-stable 
configurations which constitute dynamical barriers. In 
the physical approach based on a mapping from 3-SAT
to a spin glass model [2], such random configurations
correspond to glassy states [3]. It should be noted, however,
that many previous attempts to implement this idea were
unsuccessful, because the random structure was usually
easy to remove, or because the knowledge that a solution had
been forced could be exploited to find it. In the instances we
propose, instead, the presence of a known solution does 
not alter the structure of the glassy state, which confuses 
the solver and makes the problem hard. 

As an important application of these ideas to
cryptography [4], random one-way functions are provided: a
given message, e.g. a password, can be coded in a 3-SAT
formula and thus verified efficiently, but decoding it is
extremely time-consuming.

We use the framework of typical-case computational
complexity [5]. There, the study of random 3-SAT
problems has played a major role. A random 3-SAT
formula $F$ consists of $M$ logical clauses
$\{C_\mu\}_{\mu=1,\dots,M}$ over a set of $N$ Boolean
variables $\{x_i = 0,1\}_{i=1,\dots,N}$, with
0=FALSE and 1=TRUE. Every clause consists of three
randomly chosen Boolean variables which are connected
by logical OR operations ($\vee$) and appear negated with
probability 1/2, e.g. $C_\mu = (x_i \vee \bar{x}_j \vee x_k)$.
In $F$ the clauses are connected by logical AND operations
($\wedge$), $F = \bigwedge_{\mu=1}^{M} C_\mu$, so that all
clauses have to be satisfied simultaneously in order to
satisfy the formula.

A satisfying logical assignment of the $x_i$ is also called
a solution of $F$. The random 3-SAT model was found to
undergo a SAT/UNSAT phase transition [6] at a critical
ratio $\alpha_c = M/N \simeq 4.25$ ($N \gg 1$): Below
$\alpha_c$, almost all formulae are satisfiable, while beyond
$\alpha_c$ almost all formulae do not have any solution. At
this threshold, a strong exponential peak in the typical
(median) cost for finding solutions by the best known
algorithms appears. Problem instances generated close to it
form a natural test bed for the optimization of heuristic
search algorithms.
However, satisfiable and unsatisfiable instances coexist
in this region. Many algorithms of practical interest [7]
are based on incomplete stochastic local search procedures,
e.g. simulated annealing [8] and the walk-SAT
algorithm [9]. These algorithms stop once they have
found a solution, but they have no way to decide,
in a time polynomial in $N$, whether a formula is
unsatisfiable or just hard to solve. It is thus very
important to generate benchmarks which are satisfiable and
for which the algorithmic proof of this satisfiability takes
a time exponential in $N$.

In this Letter, we propose simple and fast generators of 
such benchmark problems. The main ideas are inspired 
by physical requirements, and exploit the presumed
hardness of random 3-SAT itself. One obvious possibility [10]
is to filter the problems at the phase boundary by complete
algorithms, and to keep only the satisfiable ones.
This method is limited to the small values of $N$ and $M$
which can be handled by the filtering algorithms, since the
generation itself takes exponentially long. In addition,
the hardest instances are the unsatisfiable ones.
Other approaches use mappings from various hard problems
to 3-SAT, including e.g. factorization [11], graph
coloring [12], and Latin square completion [13].

We choose an arbitrary assignment of our logical variables
and accept, with some prescribed probability, only
clauses which are satisfied by this assignment. Without
loss of generality, we restrict ourselves to generating
formulae which are satisfied by $x_i^{(1)} = 1$,
$\forall i = 1,\dots,N$ [14]. So, only clauses containing
three negated variables are excluded; all other clauses are
satisfied by $x^{(1)}$. The generation of random 3-SAT
formulae is done as follows: For each of the $M = \alpha N$
clauses, we draw randomly and independently three indices
$i, j, k \in \{1,\dots,N\}$. Then, we choose one of the seven
allowed clauses with the following probabilities:
clause $(x_i \vee x_j \vee x_k)$ (type "0") with probability
$p_0$; each of the clauses $(\bar{x}_i \vee x_j \vee x_k)$,
$(x_i \vee \bar{x}_j \vee x_k)$ and $(x_i \vee x_j \vee \bar{x}_k)$
(type "1") with probability $p_1$; and finally each of
$(\bar{x}_i \vee \bar{x}_j \vee x_k)$,
$(\bar{x}_i \vee x_j \vee \bar{x}_k)$ and
$(x_i \vee \bar{x}_j \vee \bar{x}_k)$ (type "2") with
probability $p_2$, where $p_0 + 3p_1 + 3p_2 = 1$.
As we will show in the following, typically hard instances
can be generated if the parameters are chosen as follows:

$$\alpha > 4.25, \qquad 0.077 < p_0 < 0.25, \qquad p_1 = \frac{1 - 4p_0}{6}, \qquad p_2 = \frac{1 + 2p_0}{6}. \qquad (1)$$
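
As an illustration of the generation rule described above, here is a minimal Python sketch. It is not the authors' code: the function names, the DIMACS-style signed-integer encoding of literals, and the choice to draw three distinct indices per clause (the text only says the indices are drawn independently) are assumptions made for this example.

```python
import random

def clause_type_probs(p0):
    """Return (p0, p1, p2) with p0 + p1 - p2 = 0 and p0 + 3*p1 + 3*p2 = 1."""
    if not 0.0 <= p0 <= 0.25:
        raise ValueError("p0 must lie in [0, 1/4]")
    return p0, (1 - 4 * p0) / 6, (1 + 2 * p0) / 6

def hidden_solution_3sat(n, alpha, p0, seed=None):
    """Generate round(alpha * n) clauses, all satisfied by x_i = 1 for all i.

    A clause is a tuple of three signed 1-based indices:
    +i stands for x_i, -i for NOT x_i.
    """
    rng = random.Random(seed)
    _, p1, p2 = clause_type_probs(p0)
    # Number of negated literals is 0, 1 or 2; three negations are forbidden.
    weights = [p0, 3 * p1, 3 * p2]
    clauses = []
    for _ in range(round(alpha * n)):
        # Assumption of this sketch: the three indices are distinct.
        idx = rng.sample(range(1, n + 1), 3)
        n_neg = rng.choices([0, 1, 2], weights=weights)[0]
        negated = set(rng.sample(range(3), n_neg))  # which positions to negate
        clauses.append(tuple(-v if pos in negated else v
                             for pos, v in enumerate(idx)))
    return clauses

def satisfies(clauses, assignment):
    """assignment maps a 1-based variable index to True/False."""
    return all(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses)
```

Choosing the number of negations with weights $(p_0, 3p_1, 3p_2)$ and then the negated positions uniformly gives each individual clause of type 1 or 2 the probability $p_1$ or $p_2$, as required.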

To understand this model, and to find values for $p_0$, $p_1$
and $p_2$ such that the instances are as hard as possible, we
have followed a statistical mechanics approach corroborated
by numerical simulations based on both complete
and randomized algorithms. The analysis is based on the
standard representation of 3-SAT as a diluted spin-glass
model [2]: The Boolean variables $x_i = 0, 1$ are mapped
to Ising spins $S_i = -(-1)^{x_i}$, so that $S_i = +1$
corresponds to TRUE, and the Hamiltonian counts
the number of unsatisfied clauses,

$$\mathcal{H} = \frac{\alpha}{8} N - \sum_{i=1}^{N} H_i S_i - \sum_{i<j} T_{ij} S_i S_j - \sum_{i<j<k} J_{ijk} S_i S_j S_k \qquad (2)$$

with $H_i = \frac{1}{8} \sum_\mu c_{\mu,i}$,
$T_{ij} = -\frac{1}{8} \sum_\mu c_{\mu,i} c_{\mu,j}$, and
$J_{ijk} = \frac{1}{8} \sum_\mu c_{\mu,i} c_{\mu,j} c_{\mu,k}$,
where $c_{\mu,i}$ equals $+1$ if $x_i$ appears directly in
$C_\mu$, $-1$ if it appears negated, and $0$ otherwise.
The interactions in $\mathcal{H}$ fluctuate from sample to
sample, with disorder averages
$\overline{H_i} = \frac{3\alpha}{8} (p_0 + p_1 - p_2)$,
$\overline{T_{ij}} = \frac{3\alpha}{4N} (-p_0 + p_1 + p_2)$, and
$\overline{J_{ijk}} = \frac{3\alpha}{4N^2} (p_0 - 3p_1 + 3p_2)$.
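
The statement that the Hamiltonian counts the unsatisfied clauses can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it uses the convention $S_i = +1$ for TRUE (i.e. $S_i = 2x_i - 1$), encodes literals as signed integers, and uses the per-clause expansion $\prod (1 - c S)/2 = (1 - \sum cS + \sum cc'SS' - cc'c''SS'S'')/8$, which corresponds to $H_i = \frac{1}{8}\sum_\mu c_{\mu,i}$, $T_{ij} = -\frac{1}{8}\sum_\mu c_{\mu,i}c_{\mu,j}$ and $J_{ijk} = \frac{1}{8}\sum_\mu c_{\mu,i}c_{\mu,j}c_{\mu,k}$. The function names are hypothetical.

```python
from itertools import combinations

def unsat_count(clauses, x):
    """Direct count of violated clauses; x maps variable index -> 0 or 1."""
    return sum(all(x[abs(l)] != (1 if l > 0 else 0) for l in c)
               for c in clauses)

def spin_energy(clauses, x):
    """Same quantity from the spin expansion, with S_i = +1 for TRUE."""
    energy = 0.0
    for c in clauses:
        # Products c_{mu,i} * S_i for the three literals of this clause.
        cs = [(1 if l > 0 else -1) * (2 * x[abs(l)] - 1) for l in c]
        field = sum(cs)                                    # one-spin terms
        pair = sum(a * b for a, b in combinations(cs, 2))  # two-spin terms
        triple = cs[0] * cs[1] * cs[2]                     # three-spin term
        energy += (1 - field + pair - triple) / 8
    return energy
```

Each clause contributes 1 when all three of its literals are false and 0 otherwise, so the two functions agree on every assignment.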

We are interested in the ground states of this Hamiltonian.
For a satisfiable formula we know that the corresponding
ground-state energy vanishes. In order to analytically
characterize the ground-state properties, we
first calculate the free energy at formal temperature $T$,
using the functional replica trick in the replica-symmetric
framework. Then we send $T \to 0$, and we study the
zero-temperature phase diagram of the model using $\alpha$
and $p_{0,1,2}$ as control parameters.

The replica-symmetric order parameter determining
the different phases of the system is the distribution of
local magnetizations $P(m) = \frac{1}{N} \sum_i \delta(m - m_i)$,
where $m_i = \langle S_i \rangle_{T=0}$ is the average value
of $S_i$ over all ground states. There are mainly two
different cases:
(i) $P(m)$ has a non-zero average and/or is broad, but all
$|m_i|$ are less than 1. It can be determined using a simple
population dynamics algorithm or variationally [16].
Both results coincide.

(ii) $P(m)$ can be calculated exactly and turns out to have
a finite weight in $m = 1$, i.e. an extensive number of
variables is fixed to $x_i = 1$ in all satisfying assignments
(the so-called backbone [17]).

Going back to the class of generators proposed above,
one could naively use $p_0 = p_1 = p_2 = 1/7$ (model 1/7),
choosing any of the allowed clauses with the same
probability. This generator, including some extensions
[18, 19, 20], is known to be efficiently solvable by local
search procedures. In our walk-SAT implementation,
the maximal resolution time [21] grows only polynomially
with $N$, and large systems of sizes up to $N \simeq 10^4$
can be easily handled.
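
For readers unfamiliar with walk-SAT, the following is a minimal, simplified sketch of a WalkSAT-style local search. It is not the implementation used for the experiments reported here: the noise parameter, the flip budget, the greedy criterion (fewest unsatisfied clauses among those containing the flipped variable) and all names are assumptions of this example, and the unsatisfied-clause list is recomputed at every step for clarity rather than speed.

```python
import random

def walksat(clauses, n, max_flips=100_000, noise=0.5, seed=None):
    """Return a satisfying assignment (list of bools, index 0 unused) or None.

    Clauses are tuples of signed 1-based indices (+i: x_i, -i: NOT x_i).
    """
    rng = random.Random(seed)
    x = [False] + [rng.random() < 0.5 for _ in range(n)]

    def is_sat(clause):
        return any(x[abs(l)] == (l > 0) for l in clause)

    occurs = [[] for _ in range(n + 1)]  # clauses touching each variable
    for c in clauses:
        for l in c:
            occurs[abs(l)].append(c)

    def unsat_after_flip(v):
        # Tentatively flip v, count unsatisfied clauses containing v, undo.
        x[v] = not x[v]
        count = sum(not is_sat(c) for c in occurs[v])
        x[v] = not x[v]
        return count

    for _ in range(max_flips):
        unsat = [c for c in clauses if not is_sat(c)]
        if not unsat:
            return x
        clause = rng.choice(unsat)
        if rng.random() < noise:
            var = abs(rng.choice(clause))  # random-walk step
        else:
            var = min((abs(l) for l in clause), key=unsat_after_flip)  # greedy
        x[var] = not x[var]
    return None  # budget exhausted: formula may be unsatisfiable or just hard
```

Returning `None` only signals that the flip budget ran out, which is why an incomplete search of this kind cannot distinguish an unsatisfiable formula from a hard satisfiable one.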

The statistical mechanics approach clarifies this result:
The proposed generator behaves like a paramagnet in an
external random field, and no ferromagnetic phase
transition appears. Local search algorithms may exploit the
average local field $\overline{H_i} = 3\alpha/56$ pointing in
the direction of the forced solution $x^{(1)}$, and rapidly
find a solution.

To avoid this, we can fix the average local field to zero
by choosing $p_0 + p_1 - p_2 = 0$. The probabilities are
thus restricted by $0 \le p_0 \le 1/4$, $p_1 = (1 - 4p_0)/6$, and
$p_2 = (1 + 2p_0)/6$.
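
For completeness, the restriction follows in a few lines from the zero-field condition combined with the normalization $p_0 + 3p_1 + 3p_2 = 1$ of the seven clause probabilities; the worked steps below only spell out this elementary algebra.

```latex
\begin{align}
p_2 &= p_0 + p_1 && \text{(zero average local field)} \\
1 &= p_0 + 3p_1 + 3(p_0 + p_1) = 4p_0 + 6p_1
  \;\Rightarrow\; p_1 = \frac{1 - 4p_0}{6}, \\
p_2 &= p_0 + \frac{1 - 4p_0}{6} = \frac{1 + 2p_0}{6}, \\
p_1 &\ge 0 \;\Rightarrow\; p_0 \le \tfrac{1}{4}.
\end{align}
```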

Let us start the discussion of these possibilities with
the case $p_0 = 0$, $p_1 = p_2 = 1/6$ (model 1/6). In this
(and only this) case, there is a second guaranteed solution:
$x_i = 0$, $\forall i$. The average $J_{ijk}$ vanishes, too. The
model is paramagnetic at low $\alpha$, and undergoes a second
order ferromagnetic transition at $\alpha \simeq 3.74$ (see
full line in Fig. 1). But also in the ferromagnetic phase the
backbone is still zero as long as $\alpha < 4.91$: At this
point it appears continuously from strongly magnetized spins.

In walk-SAT experiments, we find that the generated
instances are still solvable in polynomial time, with peak
resolution times growing as $N^{2.3}$, see Fig. 2. However,
the complexity peak is not at the phase transition, but
quite close to the critical point of random 3-SAT. This
is due to the fact that walk-SAT does not sample solutions
according to the thermodynamic equilibrium distribution:
Most probably it hits solutions with small magnetization,
i.e. closer to the starting point (see Fig. 1).
For $N \to \infty$, this magnetization stays zero even after
the ferromagnetic transition. Indeed, if we restrict the
statistical mechanics analysis to zero magnetization, we find
an exponential number of solutions also beyond $\alpha = 3.74$.
More interestingly, this number coincides with the one of
random 3-SAT, which jumps to 0 at $\alpha \simeq 4.25$. So,
approaching this point, walk-SAT is no longer able to find
unmagnetized solutions for model 1/6, and it has to go
to magnetized assignments, giving rise to the
resolution-time peak.

FIG. 1: Magnetization of the first walk-SAT solution in model
1/6. Due to the (average) spin-flip symmetry, we plot the
average of $|m|$. For large $N$, the magnetization stays zero up
to $\alpha \simeq 4.1$. The full line shows the thermodynamic
average, which stays well above the asymptotic walk-SAT result.

FIG. 2: Typical walk-SAT complexity for model 1/6. We
show the average value of $\log(t/N)$. We find a clear data
collapse for small $\alpha$ in the linear regime, $t \propto N$.
The complexity peak at $\alpha \simeq 4.1$ grows polynomially as
shown in the inset. The slope of the straight line in the inset
is 1.3.

FIG. 3: Average magnetization of solutions of model 1/4,
obtained with a complete algorithm. There, the magnetization
equals the backbone size. The finite size curves cross at
$\alpha \simeq 4.25$, and tend to the analytical prediction. The
dotted continuation of the analytical line gives the globally
unstable ferromagnetic solution, starting at the spinodal point.

Once we use $p_0 > 0$, the situation changes: The
ferromagnetic transition becomes first order, as can be
seen best by the existence of metastable solutions for
$P(m)$. The transition point moves towards the random
3-SAT threshold $\alpha_c$, and the computational complexity
increases with $p_0$. Still, for $p_0 < 0.077$, the
ferromagnetic phase arises without backbone and solutions can
be easily found.

In the region $0.077 \le p_0 < 1/4$, the first order
transition is more pronounced. The system jumps at
$\alpha \simeq 4.25$ from a paramagnetic phase to a
ferromagnetic one, with a discontinuous appearance of a
backbone: For $p_0 = 0.077$, the backbone size at the threshold
is about $0.72N$, and goes up to $0.94N$ for $p_0 = 1/4$
(see Fig. 3). We conjecture the ferromagnetic critical point in
these models to coincide with the SAT/UNSAT threshold in random
3-SAT, since the topological structures giving rise to
ferromagnetism in the former induce frustration and thus
unsatisfiability in the latter.

The case $p_0 = 1/4$, and so $p_1 = 0$, $p_2 = 1/4$ (model 1/4),
is very peculiar because it can always be solved in
polynomial time using a global algorithm. Indeed, one can
unambiguously add three clauses to every existing one,
namely the other clauses allowed in model 1/4, without
losing the satisfiability of the enlarged formula [22]. The
completed formula becomes a sample of random satisfiable
3-XOR-SAT (also known as hyper-SAT [23]), which
can be mapped to a system of linear equations modulo
2, and solved in time $O(N^3)$.
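
The global algorithm can be sketched as follows. In model 1/4 every clause has zero or two negated literals, and the four such clauses on a triple $(i, j, k)$ are together equivalent to the single parity constraint $x_i \oplus x_j \oplus x_k = 1$. The Python sketch below solves the resulting linear system by Gaussian elimination modulo 2; it is an illustration rather than the authors' code, and the bit-mask row representation and the function name are choices made for this example.

```python
def solve_completed_xorsat(triples, n):
    """Solve x_i XOR x_j XOR x_k = 1 for every triple by elimination mod 2.

    Each equation is an integer bit mask: bit v (1..n) is the coefficient
    of x_v, bit 0 is the right-hand side. The all-ones assignment solves
    every equation, so the system is always consistent.
    Returns a list of 0/1 values (index 0 unused).
    """
    pivots = {}                       # pivot variable -> reduced row
    for i, j, k in triples:
        row = 1                       # right-hand side = 1
        for v in (i, j, k):
            row ^= 1 << v             # repeated indices cancel mod 2
        # Reduce against stored pivots, highest variable first.
        while row > 1:
            top = row.bit_length() - 1
            if top not in pivots:
                pivots[top] = row     # new independent equation
                break
            row ^= pivots[top]
    x = [0] * (n + 1)                 # free variables are set to 0
    for top in sorted(pivots):        # back-substitute, low pivots first
        row = pivots[top]
        val = row & 1
        for v in range(1, top):
            if (row >> v) & 1:
                val ^= x[v]
        x[top] = val
    return x
```

Any solution of the parity system satisfies all four clauses of each completed triple, and therefore the original model-1/4 formula, even though it need not coincide with the hidden all-ones assignment.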

This algorithm immediately breaks down if we choose
$p_0 \neq 1/4$. Indeed, whenever one tries to map the general
formula into a completed one, the presence of all three
types of clauses forces it into a frustrated 3-XOR-SAT
formula, which undergoes a SAT/UNSAT transition at
$\alpha = 0.918$, well below the region of our interest. So
the mapping is of no use for $p_0 \neq 1/4$. In this case, any
3-SAT instance with solution $x^{(1)}$ (and thus any solvable
one) can be generated with non-zero probability. The
worst-case is thus included in the presented generator,
and there cannot be any polynomial solver if P $\neq$ NP.

FIG. 4: Typical walk-SAT complexity for model 1/4. The
complexity peak is much more pronounced than in Fig. 2, cf.
e.g. the reachable system sizes. The inset shows the
exponential resolution-time scaling near the peak
($\alpha = 4.6$) and deep inside the ferromagnetic phase
($\alpha = 7.0$). The slopes of the straight lines are 0.075
and 0.04 respectively.

In the following table we summarize the main results
for the investigated combinations of $p_0$, $p_1$ and $p_2$.
Where only $p_0$ is reported, $p_{1,2}$ are given by Eq. (1).
We show the location $\alpha_c$ and order of the ferromagnetic
phase transition, together with the point $\alpha_{ws}$ and the
system-size scaling (P/EXP) of the maximal walk-SAT complexity.
For comparison, we have added the corresponding data
for random 3-SAT.



Model                     $\alpha_c$ (order, type)    $\alpha_{ws}$ (scaling)
$p_{0,1,2} = 1/7$         NO                          5.10 P
$p_0 = 0$                 3.74 (2nd, ferro)           4.10 P
$p_0 \in [0.077, 1/4)$    4.25 (1st, ferro)           4.25 EXP
$p_0 = 1/4$               4.25 (1st, ferro)           4.25 P
Random 3-SAT              4.25 (SAT/UNSAT)            4.25 EXP



Please note that the polynomial time-complexity of
model 1/4 is accidental and due to the existence of a
global algorithm, whereas the walk-SAT peak grows
exponentially with $N$. To corroborate this picture, we also
performed simulated annealing experiments. We easily 
find solutions in model 1/6, but get stuck in the vicinity 
of model 1/4. 

As a conclusion, we conjecture the hardest instances
to be generated with $p_0$ values close to 1/4. The
computational times for their solution are similar to those in
Fig. 4, which have been obtained for $p_0 = 1/4$ without
exploiting the global algorithm. Resolution times
are clearly exponential throughout the ferromagnetic phase
($\alpha > 4.25$). Moreover we checked that resolution times
in the paramagnetic phase ($\alpha < 4.25$) coincide, up to
finite-size effects, with those of random 3-SAT.

The physical interpretation of the hardness in this class
of models is based on the presence of glassy metastable
states of zero magnetization for $\alpha > 4.25$. These states
are dynamically favored and trap the system for very long
times during a stochastic local search. We believe that
the statistical mechanics approach can be of general
value in the formulation of hard and solvable problems,
allowing for a systematic way of producing random one-way
functions, and can help in the study of the dynamics
of randomized search algorithms.

MW thanks the ICTP in Trieste and RZ thanks the
LPTMS in Orsay for hospitality.